Skip to content

feat: support void method optimization in Java pipeline#1980

Open
HeshamHM28 wants to merge 4 commits intomainfrom
cf-java-void-optimization
Open

feat: support void method optimization in Java pipeline#1980
HeshamHM28 wants to merge 4 commits intomainfrom
cf-java-void-optimization

Conversation

@HeshamHM28
Copy link
Copy Markdown
Contributor

Summary

  • Fix Java void method instrumentation to generate compilable code for both static and instance void methods
  • For static void calls (e.g., InPlaceSorter.bubbleSortInPlace(arr)), serialize the arguments instead of the class name, which is not a value and can't be cast to Object
  • For instance void calls (e.g., sorter.bubbleSortInPlace(arr)), serialize the receiver object to capture side effects
  • Add void-specific discovery, comparison, and assert-removal support across the Java pipeline
  • Add e2e test and CI workflow for void function optimization

HeshamHM28 and others added 2 commits April 2, 2026 11:55
Discover void methods, instrument them by serializing the receiver instead
of a return value, and treat all-null comparisons as equivalent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the workflow-modified This PR modifies GitHub Actions workflows label Apr 3, 2026
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 3, 2026

Claude finished @HeshamHM28's task in 7m 40s —— View job


PR Review Summary

  • Triage PR scope — LARGE (131 lines of production code across instrumentation.py, remove_asserts.py, functions_to_optimize.py)
  • Run lint and typecheck
  • Resolve stale review threads — none found
  • Review code changes
  • Duplicate detection
  • Test coverage analysis
  • Check optimization PRs

Prek Checks

ruff and formatting: pass

mypy flagged 2 errors in codeflash/discovery/functions_to_optimize.py (lines 200, 455 — argument type mismatches on discover_functions(file_path, criteria)). These are pre-existing on the base commit and not introduced by this PR.


Code Review

Bug: Method-call receivers are re-evaluated inside the serialize call

codeflash/languages/java/instrumentation.py:356–362

is_static_call = receiver != "this" and receiver[:1].isupper()
parts: list[str] = []
if not is_static_call:
    parts.append(receiver)
parts.extend(arg_texts)
if parts:
    serialize_target = f"new Object[]{{{', '.join(parts)}}}"

receiver is the raw source text extracted by tree-sitter. For chained receivers like container.getItems().sort(data), tree-sitter sets receiver = "container.getItems()", so the generated Java becomes:

container.getItems().sort(data);
_cf_serializedResult = serialize(new Object[]{container.getItems(), data});

getItems() is called a second time during serialization. If getItems() has side effects, returns a different instance, or is expensive, this produces wrong or expensive serialization. The correct approach is to capture the receiver reference before the call:

SomeType _cf_recv = container.getItems();
_cf_recv.sort(data);
serialize(new Object[]{_cf_recv, data});

This is a real correctness issue for non-trivial receiver expressions. Simple local-variable receivers (sorter.sort(data)) are unaffected.

Fix this →

Design: is_static_call heuristic has a documented limitation worth noting

codeflash/languages/java/instrumentation.py:356

is_static_call = receiver != "this" and receiver[:1].isupper()

This works correctly for the common case (class names start uppercase, local variables start lowercase) but silently misclassifies:

  • Instance fields with unconventional uppercase names → treated as static → receiver omitted from serialization → weaker comparison
  • Chained method receivers (covered above, the real bug)

Not asking for a rewrite, but a short comment like # Heuristic: Java classes conventionally start uppercase, instances lowercase would help the next person understand the intent and its limits.

Minor: Dead code in non-precise mode void path generates invalid Java

codeflash/languages/java/instrumentation.py:439–440

elif is_void:
    replacement = f"{bare_call_stmt} {serialize_stmt}"

serialize_stmt in the non-precise path is _cf_serializedResult{iter_id} = ... (an assignment without a declaration). _cf_serializedResult{iter_id} is never declared in non-precise mode, so this code would produce a Java compilation error if ever reached. In practice this path is dead — wrap_target_calls_with_treesitter is only called from _add_behavior_instrumentation which always passes precise_call_timing=True. Either remove the dead branch or add a declaration so it would be valid if called with precise_call_timing=False.

Minor: verification_type stored but unused in Comparator

The verification_type column is populated with "void_state" vs "function_call" in the instrumentation, but Comparator.java queries all rows without filtering or branching on this field. The comparison works correctly because both original and candidate runs use the same serialization strategy — but a comment explaining that verification_type is for observability/debugging only (or future use) would avoid confusion for reviewers wondering whether the Comparator handles void_state differently.


Duplicate Detection

No duplicates detected. The void method logic is cleanly isolated in instrumentation.py and does not replicate anything in the Python or JavaScript modules.


Test Coverage

codeflash/languages/java/instrumentation.py: 83% — good. The new TestVoidMethodInstrumentation class covers the main happy-path scenarios.

The dead code path identified above (non-precise mode, void, line 439) is also not covered, which is consistent with it being unreachable from production callers.


Optimization PRs

PR #1954 (codeflash/optimize-pr1949-2026-04-01T17.42.23) is open against cf-1082-benchmark-noise-floor with CONFLICTING state. It is 2 days old (created 2026-04-01) — the 3-day threshold for auto-closing has not been reached. Leaving open.


@codeflash-ai
Copy link
Copy Markdown
Contributor

codeflash-ai bot commented Apr 3, 2026

⚡️ Codeflash found optimizations for this PR

📄 25% (0.25x) speedup for JavaAssertTransformer._generate_replacement in codeflash/languages/java/remove_asserts.py

⏱️ Runtime : 886 microseconds 708 microseconds (best of 250 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch cf-java-void-optimization).

Static Badge

claude bot added a commit that referenced this pull request Apr 3, 2026
…2026-04-03T13.47.39

⚡️ Speed up method `JavaAssertTransformer._generate_replacement` by 25% in PR #1980 (`cf-java-void-optimization`)
@codeflash-ai
Copy link
Copy Markdown
Contributor

codeflash-ai bot commented Apr 3, 2026

@HeshamHM28 HeshamHM28 force-pushed the cf-java-void-optimization branch from 10cfedf to 3bc4941 Compare April 3, 2026 14:50
@HeshamHM28 HeshamHM28 marked this pull request as ready for review April 3, 2026 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

workflow-modified This PR modifies GitHub Actions workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant